Abstract

We consider the problem of portfolio optimization in the presence of market impact, and derive optimal liquidation strategies. We discuss in detail the problem of finding the optimal portfolio under Expected Shortfall (ES) in the case of linear market impact. We show that, once market impact is taken into account, a regularized version of the usual optimization problem naturally emerges. We characterize the typical behavior of the optimal liquidation strategies, in the limit of large portfolio sizes, and show how the market impact removes the instability of ES in this context.

1 Introduction

The optimization of large portfolios is known to be highly unstable, due to large sample to sample fluctuations. This instability arises from the fact that risk measures need to be estimated from empirical observations in the market and for large portfolios we can never have enough data: by the very nature of portfolio selection, the sampling frequency cannot be high, and the look-back period cannot be too long. Therefore, the length T of the available time series cannot be sufficiently large compared to the number N of the different items in the portfolio: for large portfolios T may be, at best, of the same order of magnitude as N. Clearly, classical statistical methods cannot be expected to work in this regime. This difficulty has been well known in portfolio theory, and a host of methods have been put forward to handle it [1]-[21], but the problem was put in a particularly sharp light when it was recognized that for a critical value of the ratio N/T the estimation error actually diverges [22, 23].



A similar statistical problem arises in many other areas, and methods known as regularization can be used as a remedy, see [24] and references therein. In the context of portfolio selection, regularization penalizes large excursions of the portfolio weight vector by imposing a constraint on the length, as measured in terms of a suitably chosen norm [24]. Regularized portfolio optimization, in general, then has two choices to make [24]: (i) which risk function to use, and (ii) which Lp-norm to use as a regularizer.



Regularized portfolio optimization was studied in [24] with the increasingly popular risk measure Expected Shortfall (ES), and the L2-norm. This scheme was shown in [24] to be related to support vector regression [25, 26], and [24] introduced the necessary modifications to the support vector machine algorithm which are required by the asymmetry of the ES risk measure and by the extra constraint imposed by the fixed budget. A number of other filtering techniques that use the variance as a risk measure also implement regularization [5, 27], even when the link to statistical learning theory is not made explicit.¹

The choice of the risk measure is a subject of ongoing debate [28]. In this paper we focus on the optimization of portfolios under the Expected Shortfall. ES is the mean loss above a high quantile.² The reason for our choice is that ES provides a more accurate estimate of risk when returns have fat tailed distributions [29]-[33] and it gives a more faithful representation of large losses than Value at Risk (VaR) [34, 35] that



1 This includes works related to covariance shrinkage [6]-[9], [15]. For a discussion see also and
references therein.

2 Note the sign convention: in the context of risk measures losses are counted positive.

can be identified by the quantile itself. In addition, ES can be computed by fast linear programming algorithms [36] and, most significantly, it was shown [37]-[39] to belong to the set of coherent risk measures [40]. These properties make it an attractive alternative to VaR which has been criticized for its lack of convexity [40]-[42].

As for the choice of the regularizer, one has to realize that various regularizers act differently. The L2 norm has a tendency to favor equal weights, and thus acts as a diversification pressure [24, 43]. The L1 norm can lead to more sparse solutions, see e.g. [44, 45]. In portfolio optimization it corresponds to an exclusion of short positions [15, 27]. We refer the interested reader to [27], where its use with the [...] regularizer is not merely a matter of mathematical convenience: one has to motivate it on the basis of the nature of the problem at hand. In a Bayesian context, the choice of the regularizer expresses prior information, as it reflects what we know, or believe to know, about the possible structure in the data.

In this paper we suggest that regularization inherently arises from considering the market impact of portfolio liquidation strategies, and we show how a linear assumption for the market impact leads to the L2 norm as a regularizer.

The fact that liquidity considerations effectively regularize portfolio selection is quite natural, in view of the origin of the instability and of the no-arbitrage hypothesis. In brief, it was shown [47, 48] that the instability of risk measures arises because, when portfolio returns are estimated on a data set of finite length, an accidental arbitrage may appear, corresponding to a zero cost portfolio which happens to have a positive return on that particular data set. When this occurs, the minimization of risk measures that have no lower bound (like ES, for example) dictates taking an infinitely long position on these portfolios. However, in real markets, we expect that realized prices will react to the liquidation of a portfolio by adjusting in such a way as to eliminate accidental arbitrages. Therefore, taking the market impact of the liquidation into account as part of the portfolio optimization problem should intuitively regularize portfolio optimization.

The paper is organised as follows. In Section 2 we define the Expected Shortfall risk measure, set up the problem of its optimization, discuss its instability, and recall how to overcome this difficulty by regularization, as suggested in [24].

We show in Section 3 that regularization can be derived from estimating the risk of portfolio liquidation, within a linear assumption of market impact. This provides a clear financial motivation for choosing the L2-norm as a regularizer in the given



3 L1 can produce a stable solution only if there is real structure in the data. If there is no structure (say all the items are more or less equivalent), then the solution with the L1 norm will still be sparse, but unstable, as the set of zero weights will vary from sample to sample [46].

problem. Note that this derivation does not depend on the choice of the risk function and that it therefore justifies and motivates the use of an L2-regularizer also in methods that use the variance as a risk measure. This market impact consideration is related to a recent proposal suggesting that portfolio optimization should be defined in terms of liquidation strategies [49].

Section 4 provides a concrete illustration of how the regularizer works in typical cases. Following [50], we derive analytic results for typical instances of large portfolios, generated from a simple distribution. This shows that regularization, as suggested in [24], indeed removes the instability observed in [50], but it also shows the non-trivial role played by the confidence level: the more the risk measure is concentrated in the tail of the distribution, the weaker is the regularization induced by liquidity considerations.

The final section concludes with a short summary. Details of the derivation of the results presented in Section 4 are relegated to an Appendix.



2 Expected Shortfall, instability and regularization

We consider a portfolio $\{w_1, \ldots, w_N\}$ of $N$ assets with returns $\{x_i\}$, with $w_i$ the position of asset $i$. We impose a global budget constraint $\sum_{i=1}^{N} w_i = wN$ and we allow the weights $\{w_i\}$ to take any real value. The problem we address is that of finding the optimal weights $\{w_i\}$ that minimize the expected shortfall, defined as follows. Given the loss $\ell(\{w_i\}|\{x_i\}) = -\sum_i w_i x_i$, the probability for such loss to be smaller than a threshold $\alpha$ is

$$P_<(\{w_i\}, \alpha) = \int \prod_i dx_i\, p(\{x_i\})\, \theta\big(\alpha - \ell(\{w_i\}|\{x_i\})\big), \tag{1}$$

with $\theta(x) = 1$ if $x > 0$ and $\theta(x) = 0$ otherwise. The associated $\beta$-VaR is defined as

$$\beta\text{-VaR}(\{w_i\}) = \min\{\alpha : P_<(\{w_i\}, \alpha) \geq \beta\}, \tag{2}$$

while the Expected Shortfall $\mathrm{ES}(\{w_i\})$ is given by

$$\mathrm{ES}(\{w_i\}) = \frac{1}{1-\beta} \int \prod_i dx_i\, p(\{x_i\})\, \ell(\{w_i\}|\{x_i\})\, \theta\big(\ell(\{w_i\}|\{x_i\}) - \beta\text{-VaR}(\{w_i\})\big). \tag{3}$$

The calculation of the ES [36] can be obtained through the minimization of the function

$$F_\beta(\{w_i\}, v) = v + \frac{1}{1-\beta} \int \prod_i dx_i\, p(\{x_i\}) \big[\ell(\{w_i\}|\{x_i\}) - v\big]^+ \tag{4}$$

with respect to the auxiliary parameter $v$, where $[x]^+ = (x + |x|)/2$, i.e.

$$\mathrm{ES}(\{w_i\}) = \min_v F_\beta(\{w_i\}, v). \tag{5}$$

Approximating the integral in (4) by sampling the probability distributions of returns, the problem can be reduced to the calculation of the minimum of the cost function

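$$E[v, \{u_\tau\}] = (1-\beta) T v + \sum_{\tau=1}^{T} u_\tau$$

under the constraints

$$u_\tau \geq 0 \quad \forall \tau,$$

$$u_\tau + v + \sum_{i=1}^{N} x_{i,\tau} w_i \geq 0 \quad \forall \tau,$$

and

$$\sum_i w_i = wN,$$

where we have introduced a new set of variables defined as

$$u_\tau = \Big[-v - \sum_{i=1}^{N} w_i x_{i,\tau}\Big]^+.$$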
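
As an illustration of the sampled problem, the following Python sketch evaluates the sample ES of a fixed portfolio by minimizing the cost function over $v$ only. The function names (`sample_es`, `portfolio_losses`) and the toy return history are illustrative assumptions, not part of the original treatment. The sketch relies on the fact that the objective is convex and piecewise linear in $v$, so a minimum is attained at one of the sample losses.

```python
def sample_es(losses, beta):
    """Sample Expected Shortfall at confidence level beta (0 < beta < 1).

    Minimizes F(v) = v + sum_t max(l_t - v, 0) / ((1 - beta) * T) over v.
    """
    T = len(losses)

    def F(v):
        excess = sum(max(loss - v, 0.0) for loss in losses)
        return v + excess / ((1.0 - beta) * T)

    # F is convex and piecewise linear in v, with kinks only at the sample
    # losses, so it is enough to evaluate it at those points.
    return min(F(v) for v in losses)


def portfolio_losses(weights, returns):
    """Losses l_t = -sum_i w_i x_{i,t}, one per row of the return history."""
    return [-sum(w * x for w, x in zip(weights, row)) for row in returns]


# Toy history (T = 5 observations of N = 2 assets), purely illustrative.
returns = [[0.01, -0.02], [0.03, 0.00], [-0.04, 0.01], [0.02, 0.02], [-0.01, -0.03]]
losses = portfolio_losses([0.5, 0.5], returns)
print(sample_es(losses, beta=0.6))
```

When $(1-\beta)T$ is an integer, the value returned coincides with the average of the $(1-\beta)T$ largest sample losses. Optimizing over the weights as well would require a linear programming solver, which is deliberately left out of this sketch.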



Portfolio optimization often assumes that weights should satisfy a further constraint that fixes the expected return to some target value. However, it has been remarked [6, 51] that estimation error in sample expected returns is so large that nothing much is lost in ignoring this constraint altogether, with no appreciable effect on out-of-sample performance [3]. Also, in applications such as index tracking, where the objective is to mimic a benchmark portfolio as closely as possible, this constraint is not needed. Finally, the issues we address below [47], the main argument and the results are not affected by the presence of a constraint on expected returns. So, for the sake of simplicity, we omit this constraint in what follows.



2.1 Instability of Expected Shortfall 

Being a conditional average, ES is not bounded from below: if a portfolio produces a 
large gain, rather than a loss, then ES takes a large negative value. Now, on a finite 
sample it may happen that one of the items, or a combination of items, dominates 
the others, i.e. produces a larger return at each time point than the rest. When
such an apparent arbitrage occurs, the optimization of ES suggests going as long as
possible in the dominating asset and correspondingly short in the dominated ones.
In particular, if there are no other constraints except for the fixed budget, then this 
leads to a runaway solution and to a seemingly infinite return. Therefore, for finite 
N and T the optimization of ES will have a finite solution with a probability always 
less than one. This probability quickly approaches one as N/T goes to zero, and 
quickly approaches zero as N/T goes to infinity. The transition between the two 
limits becomes sharper and sharper as N and T go to infinity such that their ratio 
is fixed, which is the realistic limit to consider for large institutional portfolios. In 
this limit there will be a critical value of the ratio N/T where a sharp transition 
occurs between the region where the optimization of ES leads to a finite solution 
and the one where it does not. This instability of ES was pointed out in [23], where
the critical N/T as a function of the cutoff beyond which the conditional average is
calculated (i.e. the phase diagram) was determined numerically, while [50] derived
an analytic expression for it with the help of tools borrowed from the statistical physics
of random systems.
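
The runaway solution can be made concrete with a small numerical sketch. The two-asset history below is a hypothetical sample in which asset 1 beats asset 2 at every time point; the helper names and the parametrization $w = (1+K, -K)$ of the budget-one portfolios are assumptions made for illustration only.

```python
def sample_es(losses, beta):
    """Sample ES: min over v of v + sum_t max(l_t - v, 0) / ((1 - beta) * T)."""
    T = len(losses)

    def F(v):
        return v + sum(max(loss - v, 0.0) for loss in losses) / ((1.0 - beta) * T)

    # The minimum of the convex, piecewise linear F sits at a sample loss.
    return min(F(v) for v in losses)


def es_along_arbitrage(returns, K, beta):
    """ES of the budget-one portfolio w = (1 + K, -K) on a two-asset history."""
    losses = [-((1.0 + K) * x1 - K * x2) for x1, x2 in returns]
    return sample_es(losses, beta)


# Hypothetical sample: asset 1 dominates asset 2 at every time point.
returns = [(0.02, 0.01), (-0.01, -0.03), (0.00, -0.01), (0.03, 0.02), (-0.02, -0.04)]

for K in (0, 1, 10, 100):
    # The sample ES keeps decreasing as the long/short position K grows.
    print(K, es_along_arbitrage(returns, K, beta=0.6))
```

Because every sample loss decreases at least linearly in $K$, so does the sample ES, and no finite minimizer exists under the budget constraint alone.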

Although the ES is a specific case of the several possible risk measures, it was explicitly demonstrated [48] for all the coherent risk measures [40] that the presence of apparent arbitrages constitutes a sufficient condition for the portfolio optimization problem to be unfeasible. Moreover, the same instability was shown to characterize also portfolio optimization problems under downside risk measures [47], including parametric VaR, one of the standard tools in banking. In the following, we will then focus on the specific case of ES with the idea that the results derived can be generalized to such wider classes of risk measures.



2.2 Regularized Portfolio Optimization 

The essence of the problem discussed above, namely that the dimension (here N) is too high compared with the size of the statistical sample (T), is common to all fields where complex modeling or optimization problems arise. The field of statistical learning theory/machine learning (see e.g. [52]-[54]) has developed powerful and systematic methods to deal with the difficulty of insufficient data. The insights gained in that area can be directly applied to portfolio optimization, as pointed out in [24].

The main observation is that the instability is caused by over-fitting, which in turn is caused by the fact that the empirical risk is minimized in a regime in which there is not enough data to guarantee small actual risk. The weights $\{w_i\}$ constitute a linear model of the observed returns, and the model capacity has to be constrained in order to avoid over-fitting. The capacity of the linear model is monotonic in the length of the weight vector, so that minimizing that length results in better generalization performance.

In the context of portfolio selection this means that the resulting portfolio better reflects the actual variations in the data by avoiding fluctuations due to small sample size effects. This line of reasoning leads to an optimization problem in which a modified version of Expected Shortfall is minimized [24]. The change that this introduces is known under the name of regularization, and if one chooses the L2-norm as a regularizer, then the resulting method is closely related [24] to support vector regression [25, 26] and robust statistics [55].



3 Regularization from market illiquidity 

To generate cash, an investor has to liquidate (part of) his portfolio. The set up of the portfolio optimization problem above ignored the fact that this liquidation may have an impact on asset prices. Let us consider a situation in which an investor holds a portfolio of $N$ assets and, at time $t$, liquidates a fraction of such portfolio $w_t = (w_{1,t}, \ldots, w_{N,t})$, with $w_{i,t}$ representing the position of asset $i$ liquidated at time $t$. In the following we assume that the liquidation of the portfolio $w_t$ affects prices in a linear way

$$p_{t+1} = p_t + x_t - \eta w_t. \tag{6}$$

Here $x_t$ is the vector of returns, and $\eta$ is an impact parameter. Notice that investment is taken to move prices in the direction opposite to trading: selling ($w_{i,t} > 0$) will cause prices to fall and buying ($w_{i,t} < 0$) will push prices up. The cash flow generated on day $t$ is then given by

$$c_t = w_t \cdot p_{t+1} = w_t \cdot p_t + w_t \cdot (x_t - \eta w_t). \tag{7}$$

The first part $w_t \cdot p_t$ is known at time $t$, so risk only enters in the second part, $w \cdot x - \eta \|w\|^2$, where we have dropped the subscript $t$ to simplify the notation.



Similarly to what is done in classical portfolio theory [56], we consider the problem of finding the portfolio of minimal risk, for a given present value $\sum_i w_i p_{i,t} = wN$ of the realized cash flow. The parameter $w$ plays the role of a normalization, and is customarily set to one, because risk is usually linear in the size of the portfolio. Here, however, the size of the portfolio matters as the impact of liquidation strategies on prices depends on the size. We therefore keep $w$ as an independent parameter. In order to further simplify the notation, we consider $p_{i,t} = 1$, $\forall i$, so that we have the constraint

$$\sum_i w_i = wN. \tag{8}$$

As before, we take the expected shortfall as a risk measure. The loss is now given by $\ell(\{w_i\}|\{x_i\}) = -w \cdot x + \eta \|w\|^2$, and we then have to find the minimum of the cost function

$$E_\eta[v, \{u_\tau\}] = (1-\beta) T v + \sum_{\tau=1}^{T} u_\tau \tag{9}$$

under the constraints

$$u_\tau \geq 0 \quad \forall \tau, \tag{10}$$

$$u_\tau + v + \sum_{i=1}^{N} w_i x_{i,\tau} - \eta \|w\|^2 \geq 0 \quad \forall \tau, \tag{11}$$

$$\sum_i w_i = wN. \tag{12}$$

All of the $T$ inequality constraints contain a term that is independent of $\tau$ ($\tau = 1, \ldots, T$), given by

$$\epsilon = v - \eta \|w\|^2. \tag{13}$$

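Substitution of $\epsilon + \eta \|w\|^2$ for $v$ in the cost function, Eq. (9), and multiplication by $\frac{1}{2(1-\beta) T \eta}$ leads us to the regularized expected shortfall problem proposed recently (see [24], Eqs. (7)-(10)):

$$\min_{w, u, \epsilon} \left[ \frac{1}{2} \|w\|^2 + C \Big( \frac{1}{T} \sum_{\tau=1}^{T} u_\tau + (1-\beta)\, \epsilon \Big) \right] \tag{14}$$

$$\text{s.t.} \quad w \cdot x_\tau + \epsilon + u_\tau \geq 0; \quad u_\tau \geq 0; \quad \forall \tau, \tag{15}$$

$$\sum_i w_i = wN, \tag{16}$$

with

$$C = \frac{1}{2(1-\beta)\eta}. \tag{17}$$

We recognize that the term proportional to $\eta$ acts as a regularizer.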
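
The shift in Eq. (13) can be checked numerically on a sample: since the impact term is the same for every observation, it moves all losses by the same amount, and the sample ES moves with it. The sketch below is only an illustration under that reading; the function names and the toy data are hypothetical.

```python
def sample_es(losses, beta):
    """Sample ES: min over v of v + sum_t max(l_t - v, 0) / ((1 - beta) * T)."""
    T = len(losses)

    def F(v):
        return v + sum(max(loss - v, 0.0) for loss in losses) / ((1.0 - beta) * T)

    # The minimum of the convex, piecewise linear F sits at a sample loss.
    return min(F(v) for v in losses)


def es_with_impact(weights, returns, eta, beta):
    """Sample ES of the loss -w.x + eta * ||w||^2 (linear market impact)."""
    penalty = eta * sum(w * w for w in weights)
    losses = [-sum(w * x for w, x in zip(weights, row)) + penalty for row in returns]
    return sample_es(losses, beta)


# Toy data, purely illustrative.
returns = [[0.01, -0.02], [0.03, 0.00], [-0.04, 0.01], [0.02, 0.02], [-0.01, -0.03]]
weights = [1.5, -0.5]
eta = 0.01

plain = es_with_impact(weights, returns, 0.0, beta=0.6)
impacted = es_with_impact(weights, returns, eta, beta=0.6)
# The two values differ by eta * ||w||^2, the L2 penalty on the weights.
print(impacted - plain, eta * sum(w * w for w in weights))
```

In other words, minimizing the ES of the liquidation cash flow amounts to minimizing the usual sample ES plus an L2 penalty on the weights.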

To develop a simple intuitive argument, imagine that there are two portfolios $w^+$ and $w^-$, each properly normalized (i.e. $\sum_i w_i^\pm = wN$), with $w^+ \cdot x_\tau \geq w^- \cdot x_\tau$ for all $\tau = 1, \ldots, T$ and $w^+ \cdot x_\tau > w^- \cdot x_\tau$ for at least one $\tau$. Then, when $\eta = 0$, minimal Expected Shortfall would be realized by selling $K$ units of $w^-$ and buying $K+1$ units of $w^+$, with $K \to \infty$. This, as shown in [48], is the origin of the instability in coherent risk measures. Such infinite returns cannot be realized, however, by liquidating a real portfolio because prices will adjust. In the linear approximation discussed here, when $\eta > 0$, the investment behavior discussed above is going to modify future returns, because $x_{i,t+1} \to x_{i,t+1} - \eta w_i$, thereby eliminating the apparent arbitrage. This effect reflects precisely the logic behind the no-arbitrage hypothesis.
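
The argument can be illustrated with a minimal sketch that scans the size $K$ of the long/short position on a hypothetical two-asset sample with an apparent arbitrage. The data, the grid of $K$ values and the helper names are assumptions chosen for illustration.

```python
def sample_es(losses, beta):
    """Sample ES: min over v of v + sum_t max(l_t - v, 0) / ((1 - beta) * T)."""
    T = len(losses)

    def F(v):
        return v + sum(max(loss - v, 0.0) for loss in losses) / ((1.0 - beta) * T)

    # The minimum of the convex, piecewise linear F sits at a sample loss.
    return min(F(v) for v in losses)


def liquidation_es(returns, K, eta, beta):
    """Sample ES of -w.x + eta * ||w||^2 for the portfolio w = (1 + K, -K)."""
    w1, w2 = 1.0 + K, -float(K)
    penalty = eta * (w1 * w1 + w2 * w2)
    losses = [-(w1 * x1 + w2 * x2) + penalty for x1, x2 in returns]
    return sample_es(losses, beta)


def best_K(returns, eta, beta, K_grid):
    """Grid value of K with the smallest impact-adjusted sample ES."""
    return min(K_grid, key=lambda K: liquidation_es(returns, K, eta, beta))


# Hypothetical sample: asset 1 dominates asset 2 at every time point.
returns = [(0.02, 0.01), (-0.01, -0.03), (0.00, -0.01), (0.03, 0.02), (-0.02, -0.04)]
grid = range(0, 101)

print(best_K(returns, eta=0.0, beta=0.6, K_grid=grid))    # runs to the edge of the grid
print(best_K(returns, eta=0.001, beta=0.6, K_grid=grid))  # stays at a finite K
```

Without impact the preferred position is limited only by the grid, whereas any positive $\eta$ makes the quadratic cost dominate at large $K$ and selects a finite position.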

4 Behavior of large random minimal risk portfolios under regularized expected shortfall



We have argued that the observed instability can be alleviated by regularization [24]. In particular, regularized portfolio selection under Expected Shortfall is related to support vector regression [24]. This opens up the way to apply the existing support vector algorithms, duly modified, to portfolio selection.

We will show here that, if we make an assumption on the distribution of the underlying data, then we can make progress by analytical means as well. We follow the calculation in [50] to see how the regularizer takes care of the instability. In the course of the calculation we make use of the powerful techniques of the statistical physics of random systems. Some details of the derivation are reported in the Appendix, but these details are not necessary for the understanding of the result, which is a very plausible extension of that in [50]. Therefore, here we just quote the result and discuss its consequences, namely the removal of the singularity of the risk measure. We also refer the interested reader to the related literature within statistical learning theory, such as [57]-[60] and references therein.



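In the previous sections both statistical and financial considerations led us to a cost function of the form

$$E[\epsilon, \{u_\tau\}] = (1-\beta) T \epsilon + \sum_{\tau=1}^{T} u_\tau + \tilde\eta \|w\|^2,$$

where $\tilde\eta$ can be expressed in terms of $C$ or $\eta$. Starting from this expression, and averaging over the returns $\{x_{i,\tau}\}$ drawn from a Gaussian distribution⁴ via the method described in the Appendix, one arrives at a generalized cost function given in terms of three variational parameters $\Delta$, $q_0 = \tilde q_0 \Delta^2 = \sum_i w_i^2 / N$, and $\epsilon = \tilde\epsilon \Delta$, as in [50] (with $t = T/N$):

$$E(\tilde\epsilon, \tilde q_0, \Delta) = \frac{w^2}{2\Delta} + \Delta \left[ t(1-\beta)\tilde\epsilon - \frac{\tilde q_0}{2} + \frac{t}{2\sqrt{\pi}} \int_{-\infty}^{\infty} ds\, e^{-s^2} g\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) \right] + \tilde\eta\, \tilde q_0 \Delta^2, \tag{18}$$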



4 Here $x_{i,\tau}$ are taken as i.i.d. Gaussian variables with zero mean and variance $1/N$. The latter ensures a meaningful limit $N, T \to \infty$ with $N/T$ constant, and is also realistic for typical cases where $N \sim 10^3 - 10^4$.

where

$$g(x) = \begin{cases} 0, & x \geq 0 \\ x^2, & -1 \leq x < 0 \\ -2x - 1, & x < -1 \end{cases} \tag{19}$$



The difference with respect to [50] is that now we have an additional term proportional to $q_0$. Indeed, if one considers the definition of $\tilde q_0$, the extra term $\tilde\eta \tilde q_0 \Delta^2$ precisely maps into the term $\tilde\eta \|w\|^2$ added to the objective function.



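Let us now discuss how the term proportional to $\|w\|^2$ in the cost function prevents the instability in the portfolio optimization problem. The first order conditions on (18) with respect to the three variational parameters read

$$-1 + \frac{t}{\sqrt{2\pi \tilde q_0}} \int ds\, e^{-s^2} s\, g'\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) + 2\tilde\eta\Delta = 0, \tag{20}$$

$$1 - \beta + \frac{1}{2\sqrt{\pi}} \int ds\, e^{-s^2} g'\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) = 0, \tag{21}$$

$$-\frac{w^2}{2\Delta^2} + t(1-\beta)\tilde\epsilon - \frac{\tilde q_0}{2} + \frac{t}{2\sqrt{\pi}} \int ds\, e^{-s^2} g\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) + 2\tilde\eta\Delta\tilde q_0 = 0. \tag{22}$$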




In the original, non-regularized problem in [50] the instability of Expected Shortfall was indicated by the divergence of the parameter $\Delta$ which we will, therefore, call the susceptibility here. The divergence of this susceptibility is thus the signature of the instability, which made it expedient to rescale the variables as $\tilde\epsilon = \epsilon/\Delta$ and $\tilde q_0 = q_0/\Delta^2$ in the first order conditions. Notice that the variables $\tilde\epsilon$ and $\tilde q_0$ are already finite.⁵ This should be sufficient to conclude that all integrals over the variable $s$ are finite. In order to see if a solution with divergent susceptibility can exist, we now let $\Delta \to \infty$ in the first order conditions. We first note that, in order for (22) to be satisfied, $\tilde q_0 \Delta$ has to remain finite as $\Delta \to \infty$, i.e. $\tilde q_0 = a/\Delta$ with $a$ finite. However, this is in contrast with a similar constraint we can deduce from equation (20), where we find that a solution exists only if $\Delta\sqrt{\tilde q_0}$ is finite. Indeed if we multiply all terms of (20) by $\sqrt{\tilde q_0}$ and impose $\tilde q_0 = a/\Delta$ we see that all the terms are bounded except the last one, which diverges as $\sqrt{a\Delta}$. We thus conclude that no solution with divergent susceptibility can be found as long as $\tilde\eta > 0$.

The numerical solution of the first order conditions confirms this prediction. In figures 1 and 2 we show the behavior of $q_0 = \frac{1}{N}\sum_i w_i^2$ and of $\Delta$. We can clearly observe that the divergence, which is present for $\tilde\eta = 0$, disappears as soon as $\tilde\eta > 0$. This is further confirmed by figure 3 where we show that, in the unfeasible region of the original problem, $\Delta$ diverges at $\tilde\eta = 0$.

5 The divergence of $\tilde\epsilon$ and $\tilde q_0$ is prevented by equation (21), that admits solutions only if the two variables are finite.



Figure 1: $q_0$ as a function of $N/T$ for different values of $\tilde\eta$ (legend: $\tilde\eta = 0.005$, $\tilde\eta = 0.01$) and $\beta = 0.7$.




Figure 3: The susceptibility as a function of $\tilde\eta$ for the case $t = 1.5$ and $\beta = 0.7$.

5 Discussion

Let us now comment on the generality of the result. Concerning the linear assumption in Eq. (6), which is standard in the econometric literature [61], we observe that the estimate of market impact functions is a matter of active current research. Most of the empirical evidence suggests a concave shape [62], with the price impact
growing slower than w. In double auction markets, if one restricts attention to the 
instantaneous impact of market orders, the effect on the price depends on the shape 
of the order book. In order to discuss this case in some more detail, let $\rho_i(p, t)$ be the density of limit orders for asset $i$ at time $t$, and consider the situation where a market order for a quantity $w_i$ arrives at time $t$. If $p_{i,t-1}$ is the current price and $p_{i,t-1} + x_{i,t}$ is the price (of the transaction which occurred) just before the order arrives, then the price $p_{i,t}$ at which the transaction will take place is given by

$$w_i = \int_{p_{i,t}}^{p_{i,t-1} + x_{i,t}} dp\, \rho_i(p, t). \tag{23}$$



A linear impact, as the one assumed in Eq. (6), then corresponds to an order book with a constant density of limit orders. Hence, a measure of $\eta$ is given by the inverse of the density of the order book close to the best bid/ask. Since the density of the order book fluctuates and liquidity varies across assets, $\eta_{i,t}$ could also be taken as an asset dependent stochastic quantity.
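
The link between the shape of the book and the impact function can be sketched numerically. The function below walks a one-sided book of given density until a market order is filled and returns the size of the resulting price move; its name, the step size and the two example densities are illustrative assumptions, and only the magnitude of the displacement is tracked (its sign follows the direction of trading as in Eq. (6)).

```python
def price_move(quantity, density, dp=1e-4):
    """Price displacement needed to fill `quantity` against a limit-order book.

    `density(d)` is the number of shares available per unit price at a
    distance d from the pre-trade price.
    """
    filled, move = 0.0, 0.0
    while filled < quantity:
        filled += density(move) * dp  # shares resting in the next price slice
        move += dp
    return move


def flat_book(d):
    """Constant density: the case corresponding to linear impact."""
    return 100.0


def thin_near_quote(d):
    """Density growing linearly with the distance from the pre-trade price."""
    return 20000.0 * d


for q in (1.0, 2.0, 4.0):
    # Flat book: the move is proportional to q (slope 1 / density).
    # Linearly growing book: the move grows like the square root of q.
    print(q, price_move(q, flat_book), price_move(q, thin_near_quote))
```

With a flat book the displacement is the traded quantity divided by the density, which is the sense in which the order-book density close to the best quotes measures the impact parameter.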

Then the computation of the ES can still be performed in terms of the cost function (9), but now with the $T$ constraints

$$u_\tau + v + \sum_{i=1}^{N} w_i \big( x_{i,\tau} - \eta_{i,\tau} w_i \big) \geq 0 \tag{24}$$

in place of Eq. (11). In this case the mapping to a simple L2 regularizer that we have laid out in Section 3 is then complicated by the fact that the impact term depends on $\tau$ and cannot be absorbed into a $\tau$ independent constant. Nevertheless we do not expect the essential features of the problem to change with respect to the case of a constant $\eta$.

Note furthermore that different assumptions for the market impact function lead to different regularizers. For example, considering the instantaneous impact and Eq. (23), in the presence of a bid-ask spread, we expect the price to bounce from the bid to the ask, depending on the direction of trading (i.e. on the sign of $w_i$). This suggests a term proportional to the sign of $w_i$ in the equation for the price, which, in turn, would then introduce an L1 regularizer. The choice of the regularizer and the behavior of regularized portfolio optimization under the different choices is a rich subject which deserves a separate treatment. However, this choice is related to market impact and liquidity risk considerations.

Notice, finally, that the Maximal Loss limit, $\beta \to 1$, is non-trivial. In this limit, the Expected Shortfall reduces to the Maximal Loss (ML), which reads

$$\mathrm{ML}(w) = \max_{t=1,\ldots,T} \Big[ -\sum_i w_i x_{i,t} \Big]. \tag{25}$$

If we include the market impact term, i.e. if $x_{i,t} \to x_{i,t} - \eta w_i$, then

$$\mathrm{ML}(w) = \eta \sum_i w_i^2 + \max_{t=1,\ldots,T} \Big[ -\sum_i w_i x_{i,t} \Big],$$

which is clearly finite, for finite $\eta$. However, if the $\beta \to 1$ limit was taken in Eq. (17), one would find that $C \to \infty$, suggesting that regularization would disappear. The correct limit is recovered (see Appendix) by keeping $T(1-\beta) = 1$, i.e. by taking $\beta = 1 - 1/T$, and then eventually letting $T \to \infty$. Therefore, $C = T/(2\eta)$, which implies that for large $T$ and small $\eta$, liquidity considerations provide only a weak regularizer in the limit of Maximal Loss.
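
A quick numerical check of this limit, written as a sketch with hypothetical helper names and toy losses: with $\beta = 1 - 1/T$ the sample ES coincides with the largest sample loss, and the constant of Eq. (17) becomes $T/(2\eta)$.

```python
def sample_es(losses, beta):
    """Sample ES: min over v of v + sum_t max(l_t - v, 0) / ((1 - beta) * T)."""
    T = len(losses)

    def F(v):
        return v + sum(max(loss - v, 0.0) for loss in losses) / ((1.0 - beta) * T)

    # The minimum of the convex, piecewise linear F sits at a sample loss.
    return min(F(v) for v in losses)


def regularization_strength(beta, eta):
    """C = 1 / (2 (1 - beta) eta), following Eq. (17)."""
    return 1.0 / (2.0 * (1.0 - beta) * eta)


# Toy losses, purely illustrative.
losses = [0.3, -0.1, 0.7, 0.2, -0.4]
T = len(losses)
beta = 1.0 - 1.0 / T  # keeps T * (1 - beta) equal to one
eta = 0.01

print(sample_es(losses, beta), max(losses))             # ES reduces to the Maximal Loss
print(regularization_strength(beta, eta), T / (2 * eta))  # C equals T / (2 eta)
```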



6 Conclusion 

We have shown that considering an optimal liquidation policy for a portfolio automatically leads to regularized portfolio optimization. Liquidity considerations provide a way to choose the regularizer. When the market impact is assumed to be linear, we obtain the L2-norm as a regularizer. The typical behavior of large portfolios changes due to the regularization - the otherwise divergent fluctuations in the optimal solution are now tamed. This indicates that regularized portfolio optimization is a better investment strategy for large portfolios than traditional portfolio optimization, which
minimizes only the empirical risk.

A The replica calculation

We present here the calculation we use to solve the following optimization problem: find the minimum of the cost function

$$E[\epsilon, \{u_\tau\}] = (1-\beta) T \epsilon + \sum_{\tau=1}^{T} u_\tau + \tilde\eta \|w\|^2$$

under the constraints

$$u_\tau \geq 0,$$

$$u_\tau + \epsilon + \sum_{i=1}^{N} x_{i,\tau} w_i \geq 0$$

and

$$\sum_i w_i = wN.$$

The underlying process governing the returns is assumed to be i.i.d. normal. We are using the machinery of statistical physics of random systems, and the terminology will be chosen accordingly. The calculation begins with regarding the cost function as a kind of abstract "Hamiltonian" or energy functional and introducing a fictitious temperature associated with the system. The reason for this is purely technical, and the original problem will be recovered in the zero temperature limit.

Given a realization for the history of returns $\{x_{i,\tau}\}$, the calculation proceeds by considering the partition function or generating functional

$$Z_\gamma(\{x_{i,\tau}\}) = \int_{V(\{x_{i,\tau}\})} dY\, e^{-\gamma E[Y]}, \tag{26}$$

where $\gamma$ is the inverse temperature, and we have used the notation $Y$ to indicate the set of variables, and $V(\{x_{i,\tau}\})$ represents the portion of phase space where all constraints are satisfied. The minimum cost can then be computed as

$$\lim_{N \to \infty} \lim_{\gamma \to \infty} -\frac{\log Z_\gamma(\{x_{i,\tau}\})}{N\gamma}. \tag{27}$$

In order to compute typical properties of the ensemble, we average over the probability distribution of returns, that is we compute the average of $\log Z_\gamma(\{x_{i,\tau}\})$. This can be achieved through the replica trick by exploiting the identity

$$\langle \log Z \rangle = \lim_{n \to 0} \frac{\partial \langle Z^n \rangle}{\partial n}. \tag{28}$$
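
The identity can be checked numerically on any set of positive numbers by replacing the derivative at $n = 0$ with a finite difference, using $\langle Z^0 \rangle = 1$. The sketch below does this for a handful of arbitrary values; the function name, the sample and the step `n` are illustrative assumptions.

```python
import math


def log_average_via_replicas(z_samples, n=1e-6):
    """Finite-difference estimate of d<Z^n>/dn at n = 0, i.e. (<Z^n> - 1) / n."""
    mean_zn = sum(z ** n for z in z_samples) / len(z_samples)
    return (mean_zn - 1.0) / n


# Arbitrary positive "partition functions", purely illustrative.
z_samples = [0.5, 1.0, 1.5, 2.0, 3.0]

direct = sum(math.log(z) for z in z_samples) / len(z_samples)
print(direct, log_average_via_replicas(z_samples))
```

In the analytical calculation the average of $Z^n$ is computed for integer $n$ and then continued to $n \to 0$, which is the step that the finite difference mimics here.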

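The replicated partition function, corresponding to the partition function of $n$ copies of the system, can be computed as⁶

$$Z_\gamma^n[x_{i,\tau}] = \int_{-\infty}^{\infty} \prod_{a=1}^{n} d\epsilon^a \int_0^{\infty} \prod_{\tau=1}^{T}\prod_{a=1}^{n} du_\tau^a \int_{-\infty}^{\infty} \prod_{i=1}^{N}\prod_{a=1}^{n} dw_i^a \int_{-i\infty}^{i\infty} \prod_{a=1}^{n} d\lambda^a \tag{29}$$

$$\times \int_0^{\infty} \prod_{\tau=1}^{T}\prod_{a=1}^{n} d\mu_\tau^a \int_{-\infty}^{\infty} \prod_{\tau=1}^{T}\prod_{a=1}^{n} d\hat\mu_\tau^a\, \exp\Big\{ \sum_a \lambda^a \Big( \sum_i w_i^a - wN \Big) \Big\} \tag{30}$$

$$\times \prod_{\tau, a} \exp\Big\{ i \hat\mu_\tau^a \Big( u_\tau^a + \epsilon^a + \sum_i x_{i,\tau} w_i^a - \mu_\tau^a \Big) \Big\} \tag{31}$$

$$\times \exp\Big\{ -\gamma \sum_a (1-\beta) T \epsilon^a - \gamma \sum_{a,\tau} u_\tau^a - \gamma \tilde\eta \sum_{i,a} (w_i^a)^2 \Big\}. \tag{32}$$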

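Averaging over the quenched variables $\{x_{i,\tau}\}$ and introducing the overlap matrix $Q_{a,b} = \frac{1}{N}\sum_i w_i^a w_i^b$ one obtains⁷

$$Z_\gamma^n[x_{i,\tau}] = \int [D\epsilon][Du][Dw][D\lambda][D\mu][D\hat\mu][DQ][D\hat Q]\, \exp\Big\{ \sum_a \lambda^a \Big( \sum_i w_i^a - wN \Big) \Big\}$$

$$\times \exp\Big\{ -\gamma \sum_a (1-\beta) T \epsilon^a - \gamma \sum_{a,\tau} u_\tau^a - \gamma\tilde\eta \sum_{i,a} (w_i^a)^2 \Big\} \prod_\tau \exp\Big\{ -\frac{1}{2} \sum_{a,b} \hat\mu_\tau^a Q_{a,b} \hat\mu_\tau^b \Big\}$$

$$\times \exp\Big\{ \sum_{a,b} \hat Q_{a,b} \Big( N Q_{a,b} - \sum_i w_i^a w_i^b \Big) \Big\}$$

$$\times \prod_\tau \exp\Big\{ i \sum_a \hat\mu_\tau^a \big( u_\tau^a + \epsilon^a - \mu_\tau^a \big) \Big\}.$$

We can now perform the Gaussian integral over the variables $\{\hat\mu_\tau^a\}$:

$$Z_\gamma^n[x_{i,\tau}] = \int [D\epsilon][Du][Dw][D\lambda][D\mu][DQ][D\hat Q]\, \exp\Big\{ \sum_a \lambda^a \Big( \sum_i w_i^a - wN \Big) \Big\}$$

$$\times \exp\Big\{ -\gamma \sum_a (1-\beta) T \epsilon^a - \gamma \sum_{a,\tau} u_\tau^a - \gamma\tilde\eta \sum_{i,a} (w_i^a)^2 \Big\} \exp\Big\{ \sum_{a,b} \hat Q_{a,b} \Big( N Q_{a,b} - \sum_i w_i^a w_i^b \Big) \Big\}$$

$$\times \prod_\tau \exp\Big\{ -\frac{1}{2} \sum_{a,b} \big( u_\tau^a + \epsilon^a - \mu_\tau^a \big) (Q^{-1})_{a,b} \big( u_\tau^b + \epsilon^b - \mu_\tau^b \big) \Big\}$$

$$\times \exp\Big\{ -\frac{T}{2} \mathrm{tr} \log Q \Big\}.$$

6 In the following calculation we do not keep track of multiplicative factors which are constant and do not contribute to the final free energy.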

7 The variables $x_{i,\tau}$ are assumed to have zero average and variance $1/N$. Therefore, to leading order in $N$,

$$\big\langle e^{x_{i,\tau} \Gamma_{i,\tau}} \big\rangle \tag{33}$$

$$= 1 + \frac{1}{2N} \Gamma_{i,\tau}^2 + \ldots \tag{34}$$

$$= e^{\frac{1}{2N} \Gamma_{i,\tau}^2} (1 + \ldots) \tag{35}$$

where $\Gamma_{i,\tau} = i \sum_a \hat\mu_\tau^a w_i^a$ and $\ldots$ stands for higher order terms in an expansion in powers of $1/N$. In other words, as long as the variance of $x_{i,\tau}$ is well defined, the results carry through irrespective of the specific distribution of $x_{i,\tau}$ (i.e. no assumption of Gaussian returns is needed). The calculation is performed for the simple case of independent returns. The case of returns with a given correlation
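matrix can also be dealt with.

We are now allowed to perform a Gaussian integration over the variables $\{w_i^a\}$, which is going to bring the inverse of the operator $\hat Q_{a,b} + \gamma\tilde\eta\delta_{a,b}$ into the game:

$$Z_\gamma^n[x_{i,\tau}] = \int [D\epsilon][Du][D\lambda][D\mu][DQ][D\hat Q]\, \exp\Big\{ -\sum_a w \lambda^a N \Big\}$$

$$\times \exp\Big\{ -\gamma \sum_a (1-\beta) T \epsilon^a - \gamma \sum_{a,\tau} u_\tau^a \Big\}$$

$$\times \prod_\tau \exp\Big\{ -\frac{1}{2} \sum_{a,b} \big( u_\tau^a + \epsilon^a - \mu_\tau^a \big) (Q^{-1})_{a,b} \big( u_\tau^b + \epsilon^b - \mu_\tau^b \big) \Big\}$$

$$\times \exp\Big\{ \sum_{a,b} \hat Q_{a,b} N Q_{a,b} \Big\} \exp\Big\{ -\frac{T}{2} \mathrm{tr} \log Q \Big\} \exp\Big\{ -\frac{nN}{2} \log 2 \Big\}$$

$$\times \exp\Big\{ -\frac{N}{2} \mathrm{tr} \log\big( \hat Q + \gamma\tilde\eta\delta_{a,b} \big) \Big\} \exp\Big\{ \frac{N}{4} \sum_{a,b} \lambda^a \big( \hat Q + \gamma\tilde\eta\delta \big)^{-1}_{a,b} \lambda^b \Big\}.$$

Integrating now over the $\{\lambda^a\}$ we obtain

$$Z_\gamma^n[x_{i,\tau}] = \int [D\epsilon][Du][D\mu][DQ][D\hat Q]\, \exp\Big\{ -\gamma \sum_a (1-\beta) T \epsilon^a - \gamma \sum_{a,\tau} u_\tau^a \Big\}$$

$$\times \prod_\tau \exp\Big\{ -\frac{1}{2} \sum_{a,b} \big( u_\tau^a + \epsilon^a - \mu_\tau^a \big) (Q^{-1})_{a,b} \big( u_\tau^b + \epsilon^b - \mu_\tau^b \big) \Big\}$$

$$\times \exp\Big\{ N \sum_{a,b} \hat Q_{a,b} Q_{a,b} \Big\} \exp\Big\{ -\frac{T}{2} \mathrm{tr} \log Q \Big\} \exp\Big\{ -\frac{nN}{2} \log 2 \Big\}$$

$$\times \exp\Big\{ -\frac{N}{2} \mathrm{tr} \log\big( \hat Q + \gamma\tilde\eta\delta_{a,b} \big) \Big\} \exp\Big\{ -N w^2 \sum_{a,b} \big( \hat Q_{a,b} + \gamma\tilde\eta\delta_{a,b} \big) \Big\}.$$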

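Introducing the variables $y_\tau^a = \mu_\tau^a - u_\tau^a$ and $z_\tau^a = \mu_\tau^a + u_\tau^a$ and integrating over the $\{z_\tau^a\}$ one is left with

$$Z_\gamma^n[x_{i,\tau}] = \int [D\epsilon][DQ][D\hat Q]$$

$$\times \exp\Big\{ -N w^2 \sum_{a,b} \big( \hat Q_{a,b} + \gamma\tilde\eta\delta_{a,b} \big) - \gamma (1-\beta) \sum_a T \epsilon^a + N \sum_{a,b} Q_{a,b} \hat Q_{a,b} \Big\}$$

$$\times \exp\Big\{ -T n \log\gamma - \frac{T}{2} \mathrm{tr} \log Q - \frac{N}{2} \mathrm{tr} \log\big( \hat Q + \gamma\tilde\eta\delta \big) - \frac{nN}{2} \log 2 \Big\}$$

$$\times \exp\big\{ T \log Z_\gamma(\{\epsilon^a, Q\}) \big\},$$

where we have defined

$$Z_\gamma(\{\epsilon^a, Q\}) = \int \prod_a dy^a \exp\Big\{ -\frac{1}{2} \sum_{a,b} (y^a - \epsilon^a) (Q^{-1})_{a,b} (y^b - \epsilon^b) \Big\} \exp\Big\{ \gamma \sum_a y^a \theta(-y^a) \Big\}.$$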



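We now take the replica symmetric (RS) ansatz

$$
Q_{a,b} = \begin{cases} q_1, & a = b \\ q_0, & a \neq b \end{cases}
\qquad
\hat Q_{a,b} = \begin{cases} \hat q_1, & a = b \\ \hat q_0, & a \neq b \end{cases}
$$

and we define the susceptibility $\Delta q = q_1 - q_0$ as well as $\Delta\hat q = \hat q_1 - \hat q_0$. For $Q^{-1}$ we
then have

$$
(Q^{-1})_{a,b} = \begin{cases} (\Delta q - q_0)/(\Delta q)^2 + O(n), & a = b \\ -q_0/(\Delta q)^2 + O(n), & a \neq b. \end{cases}
$$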
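The small-$n$ form of $Q^{-1}$ can be checked numerically. The sketch below is only an illustration, not part of the original calculation: it assumes the RS matrix is written as $Q = \Delta q\,I + q_0 J$ (with $J$ the all-ones matrix), uses the Sherman-Morrison formula for its exact inverse at any $n$, and then continues $n \to 0$. The helper names (`rs_matrix`, `rs_inverse_entries`, `matmul`) and the numerical values are arbitrary choices.

```python
def rs_matrix(n, diag, off):
    """n x n replica-symmetric matrix: `diag` on the diagonal, `off` elsewhere."""
    return [[diag if a == b else off for b in range(n)] for a in range(n)]

def rs_inverse_entries(n, q0, dq):
    """Exact inverse of Q = dq*I + q0*J (J = all-ones matrix), for any n.

    Sherman-Morrison gives Q^{-1} = I/dq - q0 / (dq * (dq + n*q0)) * J.
    Returns (diagonal entry, off-diagonal entry).
    """
    off = -q0 / (dq * (dq + n * q0))
    return 1.0 / dq + off, off

def matmul(A, B):
    """Plain-Python product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

q0, dq = 0.7, 0.3  # arbitrary illustrative values

# 1) Exact check at integer n: Q * Q^{-1} should be the identity.
n = 4
d, o = rs_inverse_entries(n, q0, dq)
prod = matmul(rs_matrix(n, q0 + dq, q0), rs_matrix(n, d, o))
max_err = max(abs(prod[i][j] - (1.0 if i == j else 0.0))
              for i in range(n) for j in range(n))

# 2) Continuation n -> 0: compare with the entries quoted in the text.
d0, o0 = rs_inverse_entries(1e-8, q0, dq)
diag_limit = (dq - q0) / dq**2
off_limit = -q0 / dq**2
```

The difference between `d0` and `diag_limit` is of order $n$, which is the $O(n)$ correction dropped in the RS expressions.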

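The effective partition function $Z_\gamma(\{\epsilon^a, Q\})$ reads

$$
Z_\gamma(\{\epsilon^a, Q\}) = \int \prod_a dx^a\,
\exp\Big\{-\frac{1}{2}\sum_{a,b} x^a\,(Q^{-1})_{a,b}\,x^b\Big\}
\times \exp\Big\{\gamma\sum_a (x^a + \epsilon^a)\,\theta(-x^a - \epsilon^a)\Big\},
$$

where we have defined $x^a = y^a - \epsilon^a$. By introducing a Gaussian variable $s$ with
measure $dP_{q_0}(s) = \frac{ds}{\sqrt{2\pi q_0}}\, e^{-s^2/2q_0}$ we obtain, in the limit $n \to 0$,

$$
\frac{1}{n}\log Z_\gamma(\{\epsilon^a, Q\}) = \frac{q_0}{2\Delta q} + \int dP_{q_0}(s)\,\log B_\gamma(s, \epsilon, \Delta q),
$$

where

$$
B_\gamma(s, \epsilon, \Delta q) = \int dx\, \exp\Big\{-\frac{(x-s)^2}{2\Delta q} + \gamma(x+\epsilon)\,\theta(-x-\epsilon)\Big\}.
$$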
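If we also consider that

$$
\mathrm{tr}\log Q = n\big(\log\Delta q + q_0/\Delta q\big)
$$

and

$$
\mathrm{tr}\log\big(\hat Q + \gamma\eta\,\delta_{a,b}\big) = n\big(\log(\Delta\hat q + \gamma\eta) + \hat q_0/(\Delta\hat q + \gamma\eta)\big),
$$

we finally obtain the free energy

$$
\begin{aligned}
-\gamma F(q_0, \Delta q, \hat q_0, \Delta\hat q, \epsilon) = {} & q_0\Delta\hat q + \hat q_0\Delta q + \Delta q\,\Delta\hat q - w^2(\Delta\hat q + \gamma\eta) - \gamma t(1-\beta)\epsilon \\
& - t\log\gamma + t\int dP_{q_0}(s)\,\log B_\gamma(s, \epsilon, \Delta q) - \frac{t}{2}\log\Delta q \\
& - \frac{\log 2}{2} - \frac{1}{2}\Big(\log(\Delta\hat q + \gamma\eta) + \frac{\hat q_0}{\Delta\hat q + \gamma\eta}\Big),
\end{aligned}
$$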

where we have put $T = tN$. From the saddle point equations for $\hat q_0$ and $\Delta\hat q$ we get

$$
\Delta\hat q + \gamma\eta = \frac{1}{2\Delta q},
\qquad
\hat q_0 = \frac{w^2 - q_0}{2(\Delta q)^2}.
$$
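As a sanity check on this step, the following sketch verifies two things numerically. It assumes that the part of the free energy depending on the conjugate variables is $q_0\Delta\hat q + \hat q_0\Delta q + \Delta q\,\Delta\hat q - w^2(\Delta\hat q + \gamma\eta) - \frac{1}{2}[\log(\Delta\hat q + \gamma\eta) + \hat q_0/(\Delta\hat q + \gamma\eta)]$, and that the saddle point relations are $\Delta\hat q + \gamma\eta = 1/(2\Delta q)$ and $\hat q_0 = (w^2 - q_0)/(2(\Delta q)^2)$. The function names and parameter values are illustrative assumptions, not taken from the original.

```python
import math

def trlog_rs_exact(n, q0, dq):
    """tr log of dq*I + q0*J: eigenvalue dq (n-1 times) and dq + n*q0 (once)."""
    return (n - 1) * math.log(dq) + math.log(dq + n * q0)

def trlog_rs_small_n(n, q0, dq):
    """Leading order in n, as quoted in the text."""
    return n * (math.log(dq) + q0 / dq)

def hat_part(qh0, dqh, q0, dq, w, gamma_eta):
    """Terms of the free energy that depend on the conjugate variables."""
    a = dqh + gamma_eta
    return (q0 * dqh + qh0 * dq + dq * dqh - w**2 * a
            - 0.5 * (math.log(a) + qh0 / a))

q0, dq, w, gamma_eta = 1.3, 0.4, 1.0, 0.25  # arbitrary illustrative values

# Candidate saddle point for the conjugate variables.
dqh_star = 1.0 / (2.0 * dq) - gamma_eta
qh0_star = (w**2 - q0) / (2.0 * dq**2)

# Central finite differences: both derivatives should vanish at the saddle point.
h = 1e-6
grad_qh0 = (hat_part(qh0_star + h, dqh_star, q0, dq, w, gamma_eta)
            - hat_part(qh0_star - h, dqh_star, q0, dq, w, gamma_eta)) / (2 * h)
grad_dqh = (hat_part(qh0_star, dqh_star + h, q0, dq, w, gamma_eta)
            - hat_part(qh0_star, dqh_star - h, q0, dq, w, gamma_eta)) / (2 * h)

# Small-n check of the tr log formula (per replica).
n_small = 1e-6
trlog_gap = abs(trlog_rs_exact(n_small, q0, dq) / n_small
                - trlog_rs_small_n(n_small, q0, dq) / n_small)
```

Both gradients vanish to numerical precision, and `trlog_gap` is of order $n$, as expected for an expansion to first order in $n$.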



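Exploiting these relations the free energy becomes

$$
\begin{aligned}
-\gamma f(\epsilon, q_0, \Delta q) = {} & \frac{1}{2} - t\log\gamma - \gamma t(1-\beta)\epsilon + \frac{q_0 - w^2}{2\Delta q} \\
& + t\int dP_{q_0}(s)\,\log B_\gamma(s, \epsilon, \Delta q) - \frac{t-1}{2}\log\Delta q - \gamma\eta\,(q_0 + \Delta q).
\end{aligned}
$$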
Notice that

$$
\Delta q = \frac{1}{2N}\sum_i \Big\langle \big(w_i^a - w_i^b\big)^2 \Big\rangle \qquad (39)
$$



is the squared distance between two approximate solutions of the optimization problem,
drawn with a Gibbs measure with energy $E$. As $\gamma \to \infty$ the Gibbs measure gets
more and more peaked on the optimal solution. If the latter is unique, we expect
$\Delta q \to 0$. Indeed, given that the measure is nearly Gaussian, we expect $\Delta q \sim 1/\gamma$.
Hence, in the large $\gamma$ limit, it is natural to rescale $\Delta q = \Delta/\gamma$ keeping $\epsilon$ and $q_0$
independent of $\gamma$. In this limit we obtain the energy function



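$$
\begin{aligned}
E(\epsilon, q_0, \Delta) = {} & t(1-\beta)\epsilon + \frac{w^2 - q_0}{2\Delta} + \eta q_0
- t\int_{-\infty}^{-\Delta} \frac{dx}{\sqrt{2\pi q_0}}\, e^{-(x-\epsilon)^2/(2q_0)} \Big(x + \frac{\Delta}{2}\Big) \\
& + \frac{t}{2\Delta}\int_{-\Delta}^{0} \frac{dx}{\sqrt{2\pi q_0}}\, e^{-(x-\epsilon)^2/(2q_0)}\, x^2 .
\end{aligned}
$$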
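The step from the free energy to the energy function rests on a Laplace (saddle point) evaluation of the $x$-integral at large $\gamma$. The sketch below illustrates this under stated assumptions: it takes $B_\gamma(s,\epsilon,\Delta q) = \int dx\, \exp\{-(x-s)^2/(2\Delta q) + \gamma(x+\epsilon)\theta(-x-\epsilon)\}$ with $\Delta q = \Delta/\gamma$, and compares $\frac{1}{\gamma}\log B_\gamma$ at a large but finite $\gamma$ with the piecewise maximum of the exponent. The grid, the value $\gamma = 2000$ and all function names are illustrative choices.

```python
import math

def theta(x):
    """Heaviside step function."""
    return 1.0 if x > 0 else 0.0

def h(x, s, eps, delta):
    """Exponent of the integrand divided by gamma, with dq = delta / gamma."""
    return -(x - s) ** 2 / (2.0 * delta) + (x + eps) * theta(-x - eps)

def log_B_over_gamma(s, eps, delta, gamma, lo=-10.0, hi=10.0, n=40001):
    """(1/gamma) * log of the x-integral: trapezoid rule with log-sum-exp."""
    dx = (hi - lo) / (n - 1)
    vals = [gamma * h(lo + k * dx, s, eps, delta) for k in range(n)]
    m = max(vals)
    total = sum(math.exp(v - m) for v in vals)
    total -= 0.5 * (math.exp(vals[0] - m) + math.exp(vals[-1] - m))
    return (m + math.log(total * dx)) / gamma

def laplace_limit(s, eps, delta):
    """Maximum over x of h(x, s, eps, delta), written in terms of z = s + eps."""
    z = s + eps
    if z >= 0:
        return 0.0
    if z > -delta:
        return -z * z / (2.0 * delta)
    return z + delta / 2.0

eps, delta, gamma = 0.2, 0.5, 2000.0  # arbitrary illustrative values
errors = [abs(log_B_over_gamma(s, eps, delta, gamma) - laplace_limit(s, eps, delta))
          for s in (0.5, -0.4, -1.5)]  # one point in each of the three regimes
```

The residual differences are of order $\log\gamma/\gamma$. Averaging $-t$ times `laplace_limit` over a Gaussian $s$ with variance $q_0$ gives integrals over $x = s + \epsilon$ with integrands $x + \Delta/2$ on $x < -\Delta$ and $x^2/(2\Delta)$ on $-\Delta < x < 0$.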



We now define $\tilde x = x/\Delta$, $\tilde\epsilon = \epsilon/\Delta$ and $\tilde q_0 = q_0/\Delta^2$. The reason for this change of
variables is that we want to expose the singular behavior at the phase transition in
terms of a single divergent quantity$^3$ $\Delta$. Hence, we anticipate that $\tilde\epsilon$ and $\tilde q_0$ are going
to attain finite values at the transition. In terms of the rescaled variables, we have



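$$
E(\tilde\epsilon, \tilde q_0, \Delta) = \frac{w^2}{2\Delta}
+ \Delta\Big[t(1-\beta)\tilde\epsilon - \frac{\tilde q_0}{2} + \frac{t}{2\sqrt{\pi}}\int ds\, e^{-s^2} g\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big)\Big]
+ \eta\,\tilde q_0\,\Delta^2, \qquad (40)
$$

where

$$
g(x) = \begin{cases} 0, & x \geq 0 \\ x^2, & -1 \leq x < 0 \\ -2x - 1, & x < -1 \end{cases}
$$

and $\tilde q_0$ and $\tilde\epsilon$ are the solutions of the saddle point equations

$$
-1 + \frac{t}{\sqrt{2\pi\tilde q_0}}\int ds\, e^{-s^2}\, s\, g'\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) + 2\eta\Delta = 0, \qquad (41)
$$

$$
1 - \beta + \frac{1}{2\sqrt{\pi}}\int ds\, e^{-s^2}\, g'\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) = 0, \qquad (42)
$$

$$
-\frac{w^2}{2\Delta^2} + t(1-\beta)\tilde\epsilon - \frac{\tilde q_0}{2} + \frac{t}{2\sqrt{\pi}}\int ds\, e^{-s^2}\, g\big(\tilde\epsilon + s\sqrt{2\tilde q_0}\big) + 2\eta\Delta\tilde q_0 = 0. \qquad (43)
$$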
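The relation between the rescaled energy and its saddle point equations can be checked numerically. The sketch below is a hedged illustration that assumes the rescaled energy has the form $E = w^2/(2\Delta) + \Delta[t(1-\beta)\tilde\epsilon - \tilde q_0/2 + \frac{t}{2\sqrt{\pi}}\int ds\, e^{-s^2} g(\tilde\epsilon + s\sqrt{2\tilde q_0})] + \eta\tilde q_0\Delta^2$, with $g$ equal to $0$, $x^2$ and $-2x-1$ on $x \ge 0$, $-1 \le x < 0$ and $x < -1$ respectively. It compares finite-difference derivatives of $E$ with the left-hand sides of the three stationarity conditions. The quadrature rule, the evaluation point and all names are arbitrary choices.

```python
import math

def g(x):
    if x >= 0:
        return 0.0
    if x >= -1:
        return x * x
    return -2.0 * x - 1.0

def g_prime(x):
    if x >= 0:
        return 0.0
    if x >= -1:
        return 2.0 * x
    return -2.0

def gauss_avg(f, n=4001, smax=8.0):
    """Trapezoid approximation of the integral of exp(-s^2) * f(s) over s."""
    ds = 2.0 * smax / (n - 1)
    total = 0.0
    for k in range(n):
        s = -smax + k * ds
        wgt = 0.5 if k in (0, n - 1) else 1.0
        total += wgt * math.exp(-s * s) * f(s)
    return total * ds

def energy(eps, q0, delta, t, beta, w, eta):
    i0 = gauss_avg(lambda s: g(eps + s * math.sqrt(2 * q0)))
    bracket = t * (1 - beta) * eps - q0 / 2 + t / (2 * math.sqrt(math.pi)) * i0
    return w * w / (2 * delta) + delta * bracket + eta * q0 * delta**2

def saddle_residuals(eps, q0, delta, t, beta, w, eta):
    """Left-hand sides of the stationarity conditions in eps, q0 and delta."""
    arg = lambda s: eps + s * math.sqrt(2 * q0)
    i0 = gauss_avg(lambda s: g(arg(s)))
    i1 = gauss_avg(lambda s: g_prime(arg(s)))
    i1s = gauss_avg(lambda s: s * g_prime(arg(s)))
    r_eps = 1 - beta + i1 / (2 * math.sqrt(math.pi))
    r_q0 = -1 + t / math.sqrt(2 * math.pi * q0) * i1s + 2 * eta * delta
    r_delta = (-w * w / (2 * delta**2) + t * (1 - beta) * eps - q0 / 2
               + t / (2 * math.sqrt(math.pi)) * i0 + 2 * eta * delta * q0)
    return r_eps, r_q0, r_delta

# Arbitrary evaluation point (not a solution of the equations).
eps, q0, delta = -0.3, 0.5, 0.8
par = dict(t=2.0, beta=0.9, w=1.0, eta=0.1)
h = 1e-5
fd_eps = (energy(eps + h, q0, delta, **par) - energy(eps - h, q0, delta, **par)) / (2 * h)
fd_q0 = (energy(eps, q0 + h, delta, **par) - energy(eps, q0 - h, delta, **par)) / (2 * h)
fd_delta = (energy(eps, q0, delta + h, **par) - energy(eps, q0, delta - h, **par)) / (2 * h)

r_eps, r_q0, r_delta = saddle_residuals(eps, q0, delta, **par)
# dE/d(eps) = delta*t*r_eps, dE/d(q0) = (delta/2)*r_q0, dE/d(delta) = r_delta
checks = [(fd_eps, delta * par["t"] * r_eps),
          (fd_q0, 0.5 * delta * r_q0),
          (fd_delta, r_delta)]
```

Each residual is proportional to the corresponding partial derivative of $E$, so setting the three residuals to zero is the same as extremizing $E$. A root finder applied to `saddle_residuals` would be one way to solve the equations, but that is beyond this sketch.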



$^3$ It helps to note that $\Delta$ is the susceptibility.
B The Maximal Loss Problem

We show here how to recover the correct $\beta \to 1$ limit leading to the Maximal Loss
problem. The problem of finding the set of weights that minimizes the Maximal Loss
(25) can be cast into that of finding the minimum of the cost function

$$
E[u] = u \qquad (44)
$$

under the constraints

$$
u + \sum_i w_i x_{i,\tau} \geq 0, \quad \forall\tau
$$

and

$$
\sum_i w_i = wN.
$$



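Let us show how this can be recovered starting from the general problem for the
Expected Shortfall

$$
E[\epsilon, \{u_\tau\}] = (1-\beta)T\epsilon + \sum_{\tau=1}^{T} u_\tau, \qquad (45)
$$

$$
u_\tau \geq 0, \quad \forall\tau, \qquad (46)
$$

$$
u_\tau + \epsilon + \sum_i w_i x_{i,\tau} \geq 0, \quad \forall\tau, \qquad (47)
$$

$$
\sum_i w_i = wN. \qquad (48)
$$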



The first observation is that, for $\epsilon \geq \mathrm{ML}$, (47) is satisfied for any set of $\{u_\tau\}$
satisfying (46). The minimum of the cost function can then be obtained by taking $\epsilon$
equal to the Maximal Loss and $u_\tau = 0$, $\forall\tau$. By comparing the resulting expression
for (45) with (44), we can see that the two are equivalent if we keep $T(1-\beta) = 1$. If
we now introduce the regularization, the cost function for the Maximal Loss problem
reads

E[u,{wi}} =u+— \\wf. 

As in Section 3, an equivalent expression can be obtained by introducing the effect
of the price impact. The two approaches are equivalent once we have taken



C 



T 



(49) 
Notice that one can derive Eq. (49) also by taking $1 - \beta = 1/T$ in (17), which
is indeed the appropriate confidence level for maximal loss, because in a finite time
window of $T$ points, the worst possible outcome occurs with probability $1 - \beta = 1/T$.
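The equivalence at $T(1-\beta) = 1$ can be illustrated for a fixed set of weights. The sketch below is a toy check under stated assumptions, not the optimization over weights itself: it draws synthetic Gaussian returns, takes the loss in period $\tau$ to be $-\sum_i w_i x_{i,\tau}$, sets each $u_\tau$ to its smallest feasible value $\max(0, \mathrm{loss}_\tau - \epsilon)$, and minimizes the Expected Shortfall cost over $\epsilon$. All names and parameter values are hypothetical.

```python
import random

def losses(weights, returns):
    """Loss in each period tau: minus the portfolio return sum_i w_i * x_{i,tau}."""
    return [-sum(w * x for w, x in zip(weights, row)) for row in returns]

def es_objective(eps, loss, one_minus_beta):
    """(1-beta)*T*eps + sum_tau u_tau, with u_tau = max(0, loss_tau - eps)."""
    T = len(loss)
    return one_minus_beta * T * eps + sum(max(0.0, l - eps) for l in loss)

random.seed(0)
N, T = 5, 20
returns = [[random.gauss(0.0, 1.0) for _ in range(N)] for _ in range(T)]
weights = [1.0] * N  # satisfies sum_i w_i = w*N with w = 1
loss = losses(weights, returns)
max_loss = max(loss)

# The objective is piecewise linear and convex in eps, so its minimum is
# attained at one of the breakpoints eps = loss_tau.
es_min = min(es_objective(e, loss, 1.0 / T) for e in loss)

# Any eps above the maximal loss costs strictly more.
above = es_objective(max_loss + 0.5, loss, 1.0 / T)
```

With $1 - \beta = 1/T$ the minimum over $\epsilon$ equals the largest loss in the sample, so for any fixed weights the inner Expected Shortfall problem reduces to the Maximal Loss.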